refactor(runner): route flashduty auth through bash extraEnv (drops flashduty_exec op)#50

Merged
ysyneu merged 6 commits into main from feat/flashduty-exec
May 28, 2026

Conversation

Collaborator

@ysyneu ysyneu commented May 27, 2026

Summary

Two changes that make the Flashduty CLI work cleanly under the runner's bash tool:

  1. Auth via bash extraEnv (drops the dedicated flashduty_exec wire op).
  2. Hermetic CLI via PATH precedence — the runner's own bundled flashduty always wins over any copy the BYOC host happens to have on PATH.

Paired with fc-safari PR #74 (the bash guard that injects auth and rejects credential read-back).

1. Auth via bash extraEnv (supersedes flashduty_exec)

Safari recognizes flashduty CLI invocations in bash commands, injects FLASHDUTY_APP_KEY into the bash payload's env map for that one subprocess, and the runner just honors env like it already does for bash. This supersedes the earlier Phase 2 design (dedicated runner op + dedicated safari tool) — see fc-safari PR #74 for the rationale.

Removed:

  • protocol.TaskOpFlashdutyExec + FlashdutyExecArgs (wire op no longer needed)
  • case TaskOpFlashdutyExec: dispatch in ws/handler.go
  • Environment.FlashdutyExec() public + executeFlashdutyExec() impl
  • Two unit tests covering the deleted argv-isolated path

Kept (with tightened comments):

  • scrubFlashdutySecrets on executeBashCommand — defense-in-depth so the runner's own ambient FLASHDUTY_* never leaks into a bash subprocess on BYOC dev workstations where the operator's shell exports them. Safari's per-call extraEnv overlay still wins the merge for genuine CLI invocations because extraEnv is layered last.
  • All output capture / 10MB cap / .outputs/ spill semantics unchanged.

Added:

  • TestEnvironment_Bash_ExtraEnvFlashdutyInjection pins the overlay-wins-scrub contract end-to-end on the bash path.

Wire shape

Safari sends a normal bash payload; the new bit is the optional env:

{
  "command": "flashduty incident list --severity Critical --progress Triggered,Processing --output-format toon",
  "workdir": "...",
  "timeout": 120,
  "env": { "FLASHDUTY_APP_KEY": "..." }
}
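
As a rough illustration of the wire shape above, the payload could be decoded into a struct like the one below. This is only a sketch: the type name `bashPayload`, the helper `decodeBashPayload`, and the field types are assumptions made for this example, not the runner's actual protocol definitions, and the unit of `timeout` is not stated in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bashPayload is a hypothetical mirror of the JSON shape shown above.
// The runner's real struct name and field types may differ.
type bashPayload struct {
	Command string            `json:"command"`
	Workdir string            `json:"workdir"`
	Timeout int               `json:"timeout"` // unit assumed, not confirmed by the PR
	Env     map[string]string `json:"env,omitempty"`
}

// decodeBashPayload parses a raw JSON payload. A payload without "env"
// decodes to a nil map, which a caller can treat as "no overlay".
func decodeBashPayload(raw []byte) (bashPayload, error) {
	var p bashPayload
	err := json.Unmarshal(raw, &p)
	return p, err
}

func main() {
	raw := []byte(`{"command":"flashduty incident list","workdir":"/tmp","timeout":120,"env":{"FLASHDUTY_APP_KEY":"example"}}`)
	p, err := decodeBashPayload(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("command:", p.Command, "env entries:", len(p.Env))
}
```

Because `env` is optional, ordinary bash payloads that carry no credential decode the same way as before.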

The runner merges env on top of scrubFlashdutySecrets(os.Environ()) and hands the result to exec.CommandContext. The credential lives in exactly one process's envp and dies with it.
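
The scrub-then-overlay order described above can be sketched as follows. `scrubFlashdutySecrets` is named in this PR, but the body here is a guess at its behavior (drop every `FLASHDUTY_*` entry); `mergeEnv` is a hypothetical helper name introduced only for this example.

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// scrubFlashdutySecrets returns a copy of env without FLASHDUTY_* entries.
// Assumed behavior, reconstructed from the PR text rather than the source.
func scrubFlashdutySecrets(env []string) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		if strings.HasPrefix(kv, "FLASHDUTY_") {
			continue
		}
		out = append(out, kv)
	}
	return out
}

// mergeEnv (hypothetical) layers extra on top of base. Keys present in
// extra replace any same-named base entry, so the overlay is applied last.
func mergeEnv(base []string, extra map[string]string) []string {
	out := make([]string, 0, len(base)+len(extra))
	for _, kv := range base {
		key, _, _ := strings.Cut(kv, "=")
		if _, overridden := extra[key]; overridden {
			continue
		}
		out = append(out, kv)
	}
	// Sort keys so the resulting slice is deterministic.
	keys := make([]string, 0, len(extra))
	for k := range extra {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		out = append(out, k+"="+extra[k])
	}
	return out
}

func main() {
	extra := map[string]string{"FLASHDUTY_APP_KEY": "example-key"}
	env := mergeEnv(scrubFlashdutySecrets(os.Environ()), extra)
	n := 0
	for _, kv := range env {
		if strings.HasPrefix(kv, "FLASHDUTY_") {
			n++
		}
	}
	// The resulting slice would be assigned to cmd.Env before starting bash.
	fmt.Println("FLASHDUTY_* entries handed to the subprocess:", n)
}
```

Scrubbing first and overlaying second means an ambient key in the runner's own environment can never reach bash, while a key injected per call always does.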

2. Hermetic CLI: bundled-tools PATH precedence (commit 6c898876)

Problem (Q2 from the design discussion): on a BYOC host the customer may already have their own flashduty on PATH (different version, different auth). If the bash subprocess resolves flashduty to the host copy, Safari's injected FLASHDUTY_APP_KEY would be handed to a binary we don't control, and version drift could break flag parsing.

Fix: executeBashCommand now prepends the runner's bundled-tools directory to the subprocess PATH, so our flashduty shadows any host copy:

  • bundledToolsDir(): FLASHDUTY_RUNNER_BIN_DIR env override if set, else filepath.Dir(os.Executable()) (the directory the runner binary itself lives in, where the CLI is bundled alongside it).
  • withBundledToolsPath(env, dir): folds the dir onto the front of the existing PATH= entry. Idempotent: no-op when the dir is already first; handles empty PATH, empty dir, and a missing PATH entry.

Added: environment/bundled_tools_path_test.go — table test covering prepend / already-first / absent / empty-value / empty-dir / not-first.
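
A minimal sketch of the two helpers, under stated assumptions: the function names come from this PR, but the bodies are reconstructed from the description above rather than copied from the source. In particular, treating an empty `PATH=` value as "set it to the dir" and rewriting every `PATH=` entry (rather than only one) are choices made for this example, and the code matches the upper-case `PATH` key only, as on POSIX hosts. `pathSep` is a name introduced here.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// pathSep is the platform's PATH list separator (":" on POSIX).
var pathSep = string(os.PathListSeparator)

// bundledToolsDir returns the env override if set, else the directory
// containing the running executable. "" means "could not determine".
func bundledToolsDir() string {
	if dir := os.Getenv("FLASHDUTY_RUNNER_BIN_DIR"); dir != "" {
		return dir
	}
	exe, err := os.Executable()
	if err != nil {
		return ""
	}
	return filepath.Dir(exe)
}

// withBundledToolsPath puts dir at the front of every PATH= entry in env.
// It leaves an entry alone when dir is already first, and appends a new
// PATH entry when none exists. An empty dir makes the call a no-op.
func withBundledToolsPath(env []string, dir string) []string {
	if dir == "" {
		return env
	}
	const prefix = "PATH="
	out := make([]string, 0, len(env)+1)
	found := false
	for _, kv := range env {
		if !strings.HasPrefix(kv, prefix) {
			out = append(out, kv)
			continue
		}
		found = true
		value := kv[len(prefix):]
		switch {
		case value == "":
			out = append(out, prefix+dir)
		case value == dir || strings.HasPrefix(value, dir+pathSep):
			out = append(out, kv) // already first
		default:
			out = append(out, prefix+dir+pathSep+value)
		}
	}
	if !found {
		out = append(out, prefix+dir)
	}
	return out
}

func main() {
	env := withBundledToolsPath(os.Environ(), bundledToolsDir())
	for _, kv := range env {
		if strings.HasPrefix(kv, "PATH=") {
			fmt.Println(kv)
		}
	}
}
```

In this sketch, a dir that appears later in PATH but not first is still prepended, which leaves a harmless duplicate while guaranteeing lookup order.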

This is independent of the auth overlay: PATH resolution picks which binary runs; extraEnv supplies its credential.

Trade-off vs the dedicated tool

Argv isolation is gone — flashduty <verb> now runs inside bash -c "..." so a malicious model could theoretically pipe stdout through xxd or similar. Safari's bash guard (PR #74) rejects the obvious leak paths:

  • Any command referencing FLASHDUTY_APP_KEY / FLASHDUTY_API_KEY by name
  • Bulk env dumps (env, printenv, compgen -e, /proc/*/environ)
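
To make the two rejection rules concrete, here is a deliberately naive sketch of such a check. The real guard lives in fc-safari PR #74 and is not shown in this PR, so everything below is an assumption for illustration: the function name `looksLikeCredentialReadBack`, the regular expressions, and the choice to reject only a bare `env` / `printenv` while allowing the `env VAR=value cmd` form.

```go
package main

import (
	"fmt"
	"regexp"
)

// leakPatterns approximates the leak paths listed above. These patterns
// are illustrative only and are not the guard's actual rules.
var leakPatterns = []*regexp.Regexp{
	// Any reference to the credential variables by name.
	regexp.MustCompile(`FLASHDUTY_(APP|API)_KEY`),
	// Bare env / printenv, optionally followed by a pipe, redirect or separator.
	regexp.MustCompile(`(^|[;&|(\s])(env|printenv)\s*($|[;&|)>])`),
	// Bash builtin that lists exported variable names.
	regexp.MustCompile(`\bcompgen\s+-e\b`),
	// Reading another process's environment through procfs.
	regexp.MustCompile(`/proc/[^/\s]+/environ`),
}

// looksLikeCredentialReadBack (hypothetical) reports whether a bash
// command matches one of the obvious leak patterns.
func looksLikeCredentialReadBack(command string) bool {
	for _, re := range leakPatterns {
		if re.MatchString(command) {
			return true
		}
	}
	return false
}

func main() {
	for _, cmd := range []string{
		"flashduty incident list --severity Critical",
		"echo $FLASHDUTY_APP_KEY",
		"printenv | sort",
	} {
		fmt.Printf("%q rejected=%v\n", cmd, looksLikeCredentialReadBack(cmd))
	}
}
```

Pattern matching of this kind covers only the obvious paths; it cannot catch every encoding or indirection, which is why the residual risk is stated explicitly.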

Residual risk is accepted for BYOC/sandbox where the runner already runs under customer-owned credentials in a customer-controlled environment.

Test plan

  • go test ./... -count=1 — all green (envd / environment / permission, incl. the new PATH-precedence table)
  • go build ./... && go vet ./... clean
  • CI: all jobs green
  • E2E via paired safari PR — see PR #74 for the captured SSE transcripts

ysyneu added 5 commits May 28, 2026 00:16
…ction

Adds executeFlashdutyExec that fork-execs the flashduty CLI with per-call
auth env, isolated from the generic bash code path. Routes TaskOpFlashdutyExec
through the WebSocket handler. Phase 2 of fc-safari CLI adoption.

Phase 2 contract is that only the `flashduty_exec` path carries per-user
Flashduty credentials (FLASHDUTY_APP_KEY, FLASHDUTY_API_BASE, ...). The
generic `bash` tool must not see them. The previous executeBashCommand
inherited the runner's full os.Environ(), which leaked any FLASHDUTY_*
keys present in the runner process — easy to hit on a dev workstation
that exports them for safari, and impossible to audit in cloud sandboxes
when sandbox-manager forwards arbitrary entries.

Fix: scrubFlashdutySecrets() drops every FLASHDUTY_* entry from the
inherited env before bash sees it. Caller-supplied extraEnv layers on
top, so explicit hand-offs still work (test fixtures + non-secret
overrides unchanged). The flashduty_exec path is untouched — it relies
on safari's per-call extraEnv to re-add FLASHDUTY_APP_KEY in-place.

Caught in Phase 2 E2E: bash subprocess reported FLASHDUTY_APP_KEY=<key>
even though Phase 2 was supposed to remove it. Re-verified with the
scrub: bash sees zero FLASHDUTY_* entries; flashduty tool still returns
real incident data via its own auth pathway.

Two CI issues caught after the flashduty_exec PR push:

1. gofmt complained about the FlashdutyExecArgs struct alignment in
   protocol/messages.go — comment-column re-flow after the new fields.
2. windows-latest unit-test job failed because
   TestExecuteFlashdutyExec_VerbSplitsOnWhitespace shells out to /bin/echo
   to inspect argv. There is no PowerShell equivalent that preserves the
   same stdout shape, so the test is now skipped on GOOS=windows (mirrors
   the envd POSIX-only skips landed earlier).
…h extraEnv

Replaces the dedicated flashduty_exec operation with the simpler model
Safari's bash guard now drives: when Safari recognizes a `flashduty` CLI
invocation, it injects FLASHDUTY_APP_KEY into the bash payload's
`env` map for that one subprocess. The runner just honors extraEnv
on bash like it already does.

Removed surface:
- `TaskOpFlashdutyExec` const + `FlashdutyExecArgs` struct (protocol)
- `case TaskOpFlashdutyExec` dispatch (ws/handler.go)
- `Environment.FlashdutyExec()` public + `executeFlashdutyExec()` impl
- Two unit tests covering the deleted path

Kept (and tightened comments):
- `scrubFlashdutySecrets` on the bash path — defense-in-depth so the
  runner's own ambient FLASHDUTY_* never leaks into a bash subprocess
  even on BYOC dev workstations where the operator's shell exports them.
  Safari's per-call extraEnv overlay still wins the merge for genuine
  CLI invocations because extraEnv is layered last.
- All bash output capture / 10MB cap / .outputs spill semantics unchanged.

New test `TestEnvironment_Bash_ExtraEnvFlashdutyInjection` pins the
overlay-wins-scrub contract end-to-end on the bash path.

Trade-off vs the dedicated tool: argv-isolated `flashduty` invocations
no longer exist — the CLI runs inside a real `bash -c "..."` shell, so a
malicious model could in principle pipe through `xxd` or similar.
Safari's bash guard (separate PR) handles that by rejecting any command
that references `FLASHDUTY_APP_KEY` by name or bulk-dumps the env.
Residual risk is accepted for BYOC/sandbox isolation where the runner
already runs under customer-owned credentials in a customer-controlled
environment.
ysyneu changed the title from "feat(runner): add flashduty_exec operation for capability-bound auth" to "refactor(runner): route flashduty auth through bash extraEnv (drops flashduty_exec op)" on May 28, 2026
…uty CLI)

A BYOC runner executes the agent's bash commands in the customer's own
environment, where `flashduty` may already be on PATH at a different version
— or be an unrelated binary of the same name. Resolving the CLI via the
ambient PATH is therefore not hermetic.

The runner now prepends its bundled-tools directory to every bash
subprocess's PATH so OUR flashduty shadows any same-named host binary.
The dir is FLASHDUTY_RUNNER_BIN_DIR when set, else the directory the runner
executable lives in (the cloud sandbox image and the host installer both
place the CLI next to the runner binary). No-op when the dir can't be
determined, so existing deployments without a bundled CLI are unaffected.

This makes renaming the CLI to a private name (e.g. `fduty`) unnecessary:
PATH precedence guarantees our version wins even on a bare `flashduty`
invocation, so the public brand name is preserved.

withBundledToolsPath is unit-tested across prepend / already-first /
absent-PATH / empty-value / empty-dir cases.

Note: shipping the CLI into the BYOC host install (install.sh) is tracked
separately — it needs a CLI-version-pinning policy decision. The cloud
sandbox already bakes the CLI next to the runner, so that path is complete.
ysyneu merged commit bafb82a into main on May 28, 2026
10 checks passed